Seed specific transcriptional regulation.

Nucleic acid sequences and methods for their use are
provided which provide for seed specific transcription, in order
to modulate or modify expression in seed, particularly embryo
cells. Transcriptional initiation regions are identified and
isolated from plant cells and used to prepare expression
cassettes which may then be transformed into plant cells for
seed specific transcription. The method finds particular use in
conjunction with modifying fatty acid production in seed tissue.






0 255378 



Description 

SEED SPECIFIC TRANSCRIPTIONAL REGULATION 



INTRODUCTION 

Technical Field 

Genetic modification of plant material is provided for seed specific transcription. Production of endogenous 
products may be modulated or new capabilities provided. 

Background

The primary emphasis in genetic modification has been directed to prokaryotes and mammalian cells. For a
variety of reasons, plants have proven more intransigent to genetic manipulation than other eukaryotic
cells. In part, this has been the result of the different goals involved, since for the most part
plant modification has been directed to modifying the entire plant or a particular plant part in a live plant, as
distinct from modifying cells in culture.

For many applications, it will be desirable to provide for transcription in a particular plant part or at a
particular time in the growth cycle of the plant. Toward this end, there is a substantial interest in identifying
endogenous plant products whose transcription or expression is regulated in a manner of interest. In
identifying such products, one must first look for products which appear at a particular time in the cell growth
cycle or in a particular plant part, demonstrate their absence at other times or in other parts, identify nucleic acid
sequences associated with the product and then identify the sequence in the genome of the plant in order to
obtain the 5'-untranslated sequence associated with transcription. This requires substantial investigation in
first identifying the particular sequence, followed by establishing that it is the correct sequence and isolating
the desired transcriptional regulatory region. One must then prepare appropriate constructs, followed by
demonstration that the constructs are efficacious in the desired manner.

Identifying such sequences is a challenging project, subject to substantial pitfalls and uncertainty. There is,
however, substantial interest in being able to genetically modify plants, which justifies the substantial
expenditures and efforts in identifying transcriptional sequences and manipulating them to determine their
utility.

Relevant literature 

Crouch et al., In: Molecular Form and Function of the Plant Genome, eds van Vloten-Doting, Groot and Hall,
Plenum Publishing Corp. 1985, pp 555-566; Crouch and Sussex, Planta (1981) 153:64-74; Crouch et al., J. Mol.
Appl. Genet. (1983) 2:273-283; and Simon et al., Plant Molecular Biology (1985) 5:191-201, describe various
aspects of Brassica napus storage proteins. Beachy et al., EMBO J. (1985) 4:3047-3053; Sengupta-Gopalan et
al., Proc. Natl. Acad. Sci. USA (1985) 82:3320-3324; Greenwood and Chrispeels, Plant Physiol. (1985) 79:65-71
and Chen et al., Proc. Natl. Acad. Sci. USA (1986) 83:8560-8564 describe studies concerned with seed storage
proteins and genetic manipulation. Eckes et al., Mol. Gen. Genet. (1986) 205:14-22 and Fluhr et al., Science
(1986) 232:1106-1112 describe the genetic manipulation of light inducible plant genes.

SUMMARY OF THE INVENTION 

DNA constructs are provided which are employed in manipulating plant cells to provide for seed-specific
transcription. Particularly, storage protein transcriptional regions are joined to other than the wild-type gene
and introduced into plant genomes to provide for seed-specific transcription. The constructs provide for
modulation of endogenous products as well as production of heterologous products.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

Novel DNA constructs are provided which allow for modification of transcription in seed, particularly in
embryos during seed maturation. The DNA constructs comprise a regulated transcriptional initiation region
associated with seed formation, preferably in association with embryogenesis and seed maturation. Of
particular interest are those transcriptional initiation regions associated with storage proteins, such as napin,
cruciferin, β-conglycinin, phaseolin, or the like. The transcriptional initiation regions may be obtained from any
convenient host, particularly plant hosts such as Brassica, e.g. napus or campestris, soybean (Glycine max),
bean (Phaseolus vulgaris), corn (Zea mays), cotton (Gossypium sp.), safflower (Carthamus tinctorius), tomato
(Lycopersicon esculentum), and Cuphea species.

Downstream from and under the transcriptional initiation regulation of the seed specific region will be a
sequence of interest which will provide for modification of the phenotype of the seed, by modulating the
production of an endogenous product, as to amount, relative distribution, or the like, or production of a
heterologous expression product to provide for a novel function or product in the seed. The DNA construct will
also provide for a termination region, so as to provide an expression cassette into which a gene may be
introduced. Conveniently, transcriptional initiation and termination regions may be provided separated in the
direction of transcription by a linker or polylinker having one or a plurality of restriction sites for insertion of the
gene to be under the transcriptional regulation of the regulatory regions. Usually, the linker will have from 1 to
10, more usually from about 1 to 8, preferably from about 2 to 6 restriction sites. Generally, the linker will be
fewer than 100 bp, frequently fewer than 60 bp and generally at least about 5 bp.
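
The linker constraints just described can be illustrated with a minimal Python sketch that counts which recognition sites occur in a candidate polylinker and checks the stated size bounds. The example linker sequence and the function name describe_linker are hypothetical and not part of the specification; the recognition sequences are the standard ones for the named enzymes.

```python
# Standard recognition sequences for a few common restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
    "SmaI": "CCCGGG",
    "XhoI": "CTCGAG",
}


def describe_linker(seq):
    """Summarise a candidate linker against the bounds stated in the text."""
    seq = seq.upper()
    present = sorted(name for name, site in RECOGNITION_SITES.items() if site in seq)
    return {
        "length_bp": len(seq),
        "sites": present,
        # "fewer than 100 bp ... and generally at least about 5 bp"
        "size_ok": 5 <= len(seq) < 100,
        # "from 1 to 10 ... restriction sites"
        "site_count_ok": 1 <= len(present) <= 10,
    }


# Hypothetical linker carrying SmaI, SalI and PstI sites in that order.
example_linker = "CCCGGGGTCGACCTGCAG"
report = describe_linker(example_linker)
print(report)
```

A linker that fails either check would fall outside the ranges given above, although those ranges are stated as usual rather than mandatory.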

The transcriptional initiation region may be native or homologous to the host or foreign or heterologous to
the host. By foreign is intended that the transcriptional initiation region is not found in the wild-type host into
which the transcriptional initiation region is introduced.

Transcriptional initiation regions of particular interest are those associated with the Brassica napus or
campestris napin genes, acyl carrier proteins, genes that express from about day 7 to day 40 in seed,
particularly having maximum expression from about day 10 to about day 20, where the expressed gene is not
found in leaves, while the expressed product is found in seed in high abundance.

The transcriptional cassette will include in the 5'-3' direction of transcription, a transcriptional and
translational initiation region, a sequence of interest, and a transcriptional and translational termination region
functional in plants. One or more introns may also be present. The DNA sequence may have any open reading
frame encoding a peptide of interest, e.g. an enzyme, or a sequence complementary to a genomic sequence,
where the genomic sequence may be an open reading frame, an intron, a non-coding leader sequence, or any
other sequence where the complementary sequence will inhibit transcription, messenger RNA processing,
e.g. splicing, or translation. The DNA sequence of interest may be synthetic, naturally derived, or combinations
thereof. Depending upon the nature of the DNA sequence of interest, it may be desirable to synthesize the
sequence with plant preferred codons. The plant preferred codons may be determined from the codons of
highest frequency in the proteins expressed in the largest amount in the particular plant species of interest.
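
As a hedged illustration of how plant preferred codons might be tabulated, the following Python sketch counts codon usage across a set of coding sequences and picks the most frequent codon for each amino acid. The two input sequences are invented toy data, not sequences from any plant, and the function name preferred_codons is an assumption made for this example.

```python
from collections import Counter

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def preferred_codons(coding_seqs):
    """Return {amino acid: most frequent codon} over in-frame coding sequences."""
    counts = Counter()
    for seq in coding_seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


# Toy stand-ins for coding sequences of abundantly expressed proteins.
toy_sequences = ["ATGGCTGCTGCCAAGTAA", "ATGGCTAAGAAGAAATAA"]
preferred = preferred_codons(toy_sequences)
print(preferred)
```

In practice the input would be the coding sequences of the proteins expressed in the largest amount in the species of interest, as the text indicates; ties between equally frequent codons are resolved here by first occurrence, which is an arbitrary choice.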

In preparing the transcription cassette, the various DNA fragments may be manipulated, so as to provide for
the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this
end, adapters or linkers may be employed for joining the DNA fragments or other manipulations may be
involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or
the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resection, ligation, or the
like may be employed, where insertions, deletions or substitutions, e.g. transitions and transversions, may be
involved.

The termination region which is employed will be primarily one of convenience, since the termination regions
appear to be relatively interchangeable. The termination region may be native with the transcriptional initiation
region, may be native with the DNA sequence of interest, or may be derived from another source. Convenient
termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and
nopaline synthase termination regions.

By appropriate manipulations, such as restriction, chewing back or filling in overhangs to provide blunt
ends, ligation of linkers, or the like, complementary ends of the fragments can be provided for joining and
ligation.

In carrying out the various steps, cloning is employed, so as to amplify the amount of DNA and to allow for
analyzing the DNA to ensure that the operations have occurred in a proper manner. A wide variety of cloning
vectors are available, where the cloning vector includes a replication system functional in E. coli and a marker
which allows for selection of the transformed cells. Illustrative vectors include pBR322, pUC series, M13mp
series, pACYC184, etc. Thus, the sequence may be inserted into the vector at an appropriate restriction
site(s), the resulting plasmid used to transform the E. coli host, the E. coli grown in an appropriate nutrient
medium and the cells harvested and lysed and the plasmid recovered. Analysis may involve sequence analysis,
restriction analysis, electrophoresis, or the like. After each manipulation the DNA sequence to be used in the
final construct may be restricted and joined to the next sequence, where each of the partial constructs may be
cloned in the same or different plasmids.

In addition to the transcription construct, depending upon the manner of introduction of the transcription
construct into the plant, other DNA sequences may be required. For example, when using the Ti- or Ri-plasmid
for transformation of plant cells, as described below, at least the right border and frequently both the right and
left borders of the T-DNA of the Ti- and Ri-plasmids will be joined as flanking regions to the transcription
construct. The use of T-DNA for transformation of plant cells has received extensive study and is amply
described in EPA Serial No. 120,516; Hoekema, In: The Binary Plant Vector System, Offset-drukkerij Kanters
B.V., Alblasserdam, 1985, Chapter V; Fraley, et al., Crit. Rev. Plant Sci., 4:1-46; and An et al., EMBO J. (1985)
4:277-284.

Alternatively, to enhance integration into the plant genome, terminal repeats of transposons may be used as
borders in conjunction with a transposase. In this situation, expression of the transposase should be
inducible, or the transposase inactivated, so that once the transcription construct is integrated into the
genome, it should be relatively stably integrated and avoid hopping.

The transcription construct will normally be joined to a marker for selection in plant cells. Conveniently, the
marker may be resistance to a biocide, particularly an antibiotic, such as kanamycin, G418, bleomycin,
hygromycin, chloramphenicol, or the like. The particular marker employed will be one which will allow for
selection of transformed cells as compared to cells lacking the DNA which has been introduced.

A variety of techniques are available for the introduction of DNA into a plant cell host. These techniques
include transformation with Ti-DNA employing A. tumefaciens or A. rhizogenes as the transforming agent,
protoplast fusion, injection, electroporation, etc. For transformation with Agrobacterium, plasmids can be
prepared in E. coli which plasmids contain DNA homologous with the Ti-plasmid, particularly T-DNA. The
plasmid may or may not be capable of replication in Agrobacterium, that is, it may or may not have a broad
spectrum prokaryotic replication system, e.g. RK290, depending in part upon whether the transcription
construct is to be integrated into the Ti-plasmid or be retained on an independent plasmid. By means of a
helper plasmid, the transcription construct may be transferred to the A. tumefaciens and the resulting
transformed organism used for transforming plant cells.

Conveniently, explants may be cultivated with the A. tumefaciens or A. rhizogenes to allow for transfer of the
transcription construct to the plant cells, the plant cells dispersed in an appropriate selective medium for
selection, grown to callus, shoots grown and plantlets regenerated from the shoots by growing in rooting
medium. The Agrobacterium host will contain a plasmid having the vir genes necessary for transfer of the
T-DNA to the plant cells and may or may not have T-DNA. For injection and electroporation, disarmed
Ti-plasmids (lacking the tumor genes, particularly the T-DNA region) may be introduced into the plant cell.

The constructs may be used in a variety of ways. Particularly, the constructs may be used to modify the fatty
acid composition in seeds, that is changing the ratio and/or amounts of the various fatty acids, as to length,
unsaturation, or the like. Thus, the fatty acid composition may be varied, enhancing the fatty acids of from 10 to
14 carbon atoms as compared to the fatty acids of from 16 to 18 carbon atoms, increasing or decreasing fatty
acids of from 20 to 24 carbon atoms, providing for an enhanced proportion of fatty acids which are saturated or
unsaturated, or the like. These results can be achieved by providing for reduction of expression of one or more
endogenous products, particularly enzymes or cofactors, by producing a transcription product which is
complementary to the transcription product of a native gene, so as to inhibit the maturation and/or expression
of the transcription product, or providing for expression of a gene, either endogenous or exogenous,
associated with fatty acid synthesis. Expression products associated with fatty acid synthesis include acyl
carrier protein, thioesterase, acetyl transacylase, acetyl-CoA carboxylase, ketoacyl-synthases, malonyl
transacylase, stearoyl-ACP desaturase, and other desaturase enzymes.

Alternatively, one may wish to provide various products from other sources including mammals, such as
blood factors, lymphokines, colony stimulating factors, interferons, plasminogen activators, enzymes, e.g.
superoxide dismutase, chymosin, etc., hormones, rat mammary thioesterase 2, phospholipid acyl desaturases
involved in the synthesis of eicosapentaenoic acid, human serum albumin. Another purpose is to increase the
level of seed proteins, particularly mutated seed proteins, having an improved amino acid distribution which
would be better suited to the nutrient value of the seed. In this situation, one might provide for inhibition of the
native seed protein by producing a complementary DNA sequence to the native coding region or non-coding
region, where the complementary sequence would not efficiently hybridize to the mutated sequence, or
inactivate the native transcriptional capability.

The cells which have been transformed may be grown into plants in accordance with conventional ways.
See, for example, McCormick et al., Plant Cell Reports (1986) 5:81-84. These plants may then be grown, and
either pollinated with the same transformed strain or different strains, identifying the resulting hybrid having
the desired phenotypic characteristic. Two or more generations may be grown to ensure that the subject
phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure the desired
phenotype or other property has been achieved.

As a host cell, any plant variety may be employed which provides a seed of interest. Thus, for the most part,
plants will be chosen where the seed is produced in high amounts or a seed specific product of interest is
involved. Seeds of interest include the oil seeds, such as the Brassica seeds, cotton seeds, soybean,
safflower, sunflower, or the like; grain seeds, e.g. wheat, barley, rice, clover, corn, or the like.

Identifying useful transcriptional initiation regions may be achieved in a number of ways. Where the seed
protein has been or is isolated, it may be partially sequenced, so that a probe may be designed for identifying
messenger RNA specific for seed. To further enhance the concentration of the messenger RNA specifically
associated with seed, cDNA may be prepared and the cDNA subtracted with messenger RNA or cDNA from
non-seed associated cells. The residual cDNA may then be used for probing the genome for complementary
sequences, using an appropriate library prepared from plant cells. Sequences which hybridize to the cDNA
may then be isolated, manipulated, and the 5'-untranslated region associated with the coding region isolated
and used in expression constructs to identify the transcriptional activity of the 5'-untranslated region.

In some instances, the research effort may be further shortened by employing a probe directly for screening
a genomic library and identifying sequences which hybridize to the probe. The sequences will be manipulated
as described above to identify the 5'-untranslated region.

The expression constructs which are prepared employing the 5'-untranslated regions may be transformed
into plant cells as described previously for determination of their ability to function with a heterologous
structural gene (other than the wild-type open reading frame associated with the 5'-untranslated region) and
the seed specificity. In this manner, specific sequences may be identified for use with sequences for seed
specific transcription. Expression cassettes of particular interest include transcriptional initiation regions from
napin genes, particularly Brassica napin genes, more particularly Brassica napus or Brassica campestris
genes, regulating structural genes associated with lipid production, particularly fatty acid production, including
acyl carrier proteins, which may be endogenous or exogenous to the particular plant, such as spinach acyl
carrier protein, Brassica acyl carrier protein, either napus or campestris, Cuphea acyl
carrier protein, acetyl transacylase, malonyl transacylase, β-ketoacyl synthases I and II, thioesterase,
particularly thioesterase II, from plant, mammalian, or bacterial sources, for example rat thioesterase II, acyl
ACP, or phospholipid acyl desaturases.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL

Materials and Methods


Cloning Vectors 

Cloning vectors used include the pUC vectors, pUC8 and pUC9 (Vieira and Messing, Gene (1982)
19:259-268); pUC18 and pUC19 (Norrander et al., Gene (1983) 26:101-106; Yanisch-Perron et al., Gene (1985)
33:103-119), and analogous vectors exchanging chloramphenicol resistance (CAM) as a marker for the
ampicillin resistance of the pUC plasmids described above (pUC-CAM [pUC12-Cm, pUC13-Cm] Buckley, K.,
Ph.D. Thesis, U.C.S.D., CA 1985). The multiple cloning sites of pUC18 and pUC19 vectors were exchanged with
those of pUC-CAM to create pCGN565 and pCGN566 which are CAM resistant. Also used were pUC118 and
pUC119, which are respectively, pUC18 and pUC19 with the intergenic region of M13, from an HgiAI site at
5465 to the AhaIII site at 5941, inserted at the NdeI site of pUC. (Available from Vieira, J. and Messing, J.,
Waksman Institute, Rutgers University, Rutgers, NJ.)

Materials

Terminal deoxynucleotide transferase (TDT), RNaseH, E. coli DNA polymerase, T4 kinase, and restriction
enzymes were obtained from Bethesda Research Laboratories; E. coli DNA ligase was obtained from New
England Biolabs; reverse transcriptase was obtained from Life Sciences, Inc.; isotopes were obtained from
Amersham; X-gal was obtained from Bachem, Inc., Torrance, CA.

Example I 

Construction of a Napin Promoter

There are 298 nucleotides upstream of the ATG start codon of the napin gene on the pgN1 clone, a 3.3 kb
EcoRI fragment of B. napus genomic DNA containing a napin gene cloned into pUC8 (available from Marti
Crouch, University of Indiana). pgN1 DNA was digested with EcoRI and SstI and ligated to EcoRI/SstI digested
pCGN706. (pCGN706 is an XhoI/PstI fragment containing 3' and polyadenylation sequences of another napin
cDNA clone pN2 (Crouch et al., 1983 supra) cloned in pCGN566 at the SalI and PstI sites.) The resulting clone
pCGN707 was digested with SalI and treated with the enzyme Bal31 to remove some of the coding region of
the napin gene. The resulting resected DNA was digested with SmaI after the Bal31 treatment and religated.
One of the clones, pCGN713, selected by size, was subcloned by EcoRI and BamHI digestion into both
EcoRI/BamHI digested pEMBL18 (Dente et al., Nucleic Acids Res. (1983) 11:1645-1655) and pUC118 to give
E418 and E4118 respectively. The extent of Bal31 digestion was confirmed by Sanger dideoxy sequencing of
E418 template. The Bal31 deletion of the promoter region extended only to 57 nucleotides downstream of the
start codon, thus containing the 5' end of the napin coding sequence and about 300 bp of the 5' non-coding
region. E4118 was tailored to delete all of the coding region of napin including the ATG start codon by in vitro
mutagenesis by the method of Zoller and Smith (Nucleic Acids Res. (1982) 10:6487-6500) using an
oligonucleotide primer 5'-GATGTmGTATGTGG6CCCCTAGGAGATC-3'. Screening for the appropriate
mutant was done by two transformations into E. coli strain JM83 (Messing J., In: Recombinant DNA Technical
Bulletin, NIH Publication No. 79-99, 2 No. 2, 1979, pp 43-48) and SmaI digestion of putative transformants. The
resulting napin promoter clone is pCGN778 and contains 298 nucleotides from the EcoRI site of pgN1 to the A
nucleotide just before the ATG start codon of napin. The promoter region was subcloned into a
chloramphenicol resistant background by digestion with EcoRI and BamHI and ligation to EcoRI/BamHI
digested pCGN565 to give pCGN779c.

Extension of the Napin Promoter Clone

pCGN779c contains only 298 nucleotides of potential 5'-regulatory sequence. The napin promoter was
extended with a 1.8 kb fragment found upstream of the 5'-EcoRI site on the original λBnNa clone. The ~3.5 kb
XhoI fragment of λBnNa (available from M. Crouch), which includes the napin region, was subcloned into
SalI-digested pUC119 to give pCGN930. A HindIII site close to a 5' XhoI site was used to subclone the
HindIII/EcoRI fragment of pCGN930 into HindIII/EcoRI digested Bluescript + (Vector Cloning Systems, San
Diego, CA) to give pCGN942. An extended napin promoter was made by ligating pCGN779c digested with
EcoRI and PstI and pCGN942 digested with EcoRI and PstI to make pCGN943. This promoter contains ~2.1
kb of sequence upstream of the original ATG of the napin gene contained on λBnNa. A partial sequence of the
promoter region is shown in Figure 1.

Napin Cassettes 

The extended napin promoter and a napin 3'-regulatory region are combined to make a napin cassette for
expressing genes seed-specifically. The napin 3'-region used is from the plasmid pCGN1924 containing the
XhoI/EcoRI fragment from pgN1 (XhoI site is located 18 nucleotides from the stop codon of the napin gene)
subcloned into EcoRI/SalI digested pCGN565. HindIII/PstI digested pCGN943 and pCGN1924 are ligated to
make the napin cassette pCGN944, with unique cloning sites SmaI, SalI, and PstI for inserting genes.

Construction of cDNA Library from Spinach Leaves

Total RNA was extracted from young spinach leaves in 4M guanidine thiocyanate buffer as described by
Facciotti et al. (Biotechnology (1985) 3:241-246). Total RNA was subjected to oligo(dT)-cellulose column
chromatography two times to yield poly(A)+ RNA as described by Maniatis et al., (1982) Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, New York. A cDNA library was constructed in pUC13-Cm
according to the method of Gubler and Hoffman (Gene (1983) 25:263-269) with slight modifications. RNasin
was omitted in the synthesis of first strand cDNA as it interfered with second strand synthesis if not completely
removed, and dCTP was used to tail the vector DNA and dGTP to tail double-stranded cDNA instead of the
reverse as described in the paper. The annealed cDNA was transformed to competent E. coli JM83 (Messing
(1979) supra) cells according to Hanahan (J. Mol. Biol. (1983) 166:557-580) and spread onto LB agar plates
(Miller (1972) Experiments in Molecular Genetics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New
York) containing 50 µg/ml chloramphenicol and 0.005% X-Gal.

Identification of Spinach ACP-I cDNA

A total of approximately 8000 cDNA clones were screened by performing Southern blot (Southern, J. Mol.
Biol. (1975) 98:503) and dot blot (described below) hybridizations with clone analysis DNA from 40 pools
representing 200 cDNA clones each (see below). A 5' end-labeled synthetic oligonucleotide (ACPP4) that is at
least 66% homologous with a 16 amino acid region of spinach ACP-I
(5'-GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC-3') is the complement to a DNA sequence
that could encode the 16 amino acid peptide
glu-glu-glu-phe-gly-ile-asn-val-asp-glu-asp-lys-ala-gln-asp-ile, residues 49-64 of spinach ACP-I
(Kuo and Ohlrogge, Arch. Biochem. Biophys. (1984) 234:290-296), and was used for an ACP probe.
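
The relationship between the ACPP4 probe and the peptide it targets can be checked with a short Python sketch: the probe is reverse-complemented to obtain the strand that could encode the peptide, and that strand is translated with the standard genetic code. The probe sequence is taken from the text with the line-break hyphen removed; the helper names are assumptions made for this illustration.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# ACPP4 probe as given in the text (5' to 3').
ACPP4 = "GATGTCTTGAGCCTTGTCCTCATCCACATTGATACCAAACTCCTCCTC"

coding_strand = reverse_complement(ACPP4)
peptide = translate(coding_strand)
print(coding_strand)
print(peptide)
```

The 48-nucleotide probe corresponds to 16 codons, and the translated peptide begins Glu-Glu-Glu-Phe-Gly-Ile-Asn-Val-Asp, in agreement with the peptide named in the text.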

Clone analysis DNA for Southern and dot blot hybridizations was prepared as follows. Transformants were
transferred from agar plates to LB containing 50 µg/ml chloramphenicol in groups of ten clones per 10 ml
media. Cultures were incubated overnight in a 37°C shaking incubator and then diluted with an equal volume of
media and allowed to grow for 5 more hours. Pools of 200 cDNA clones each were obtained by mixing
contents of 20 samples. DNA was extracted from these cells as described by Birnboim and Doly (Nucleic
Acids Res. (1979) 7:1513-1523). DNA was purified to enable digestion with restriction enzymes by extractions
with phenol and chloroform followed by ethanol precipitation. DNA was resuspended in sterile, distilled water
and 1 µg of each of the 40 pooled DNA samples was digested with EcoRI and HindIII and electrophoresed
through 0.7% agarose gels. DNA was transferred to nitrocellulose filters following the blot hybridization
technique of Southern.
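
The pooling arithmetic described above (ten clones per culture, twenty cultures per pool, forty pools in all) can be sketched in Python as follows. The function names and the zero-based clone numbering are assumptions made for this illustration; the text does not say how individual clones were indexed.

```python
def pool_layout(n_clones, clones_per_culture=10, cultures_per_pool=20):
    """Return (clones per pool, number of pools needed) for a library."""
    per_pool = clones_per_culture * cultures_per_pool
    n_pools = -(-n_clones // per_pool)  # ceiling division
    return per_pool, n_pools


def locate(clone_index, clones_per_culture=10, cultures_per_pool=20):
    """Return (pool, culture within pool) for a zero-based clone index."""
    culture = clone_index // clones_per_culture
    return culture // cultures_per_pool, culture % cultures_per_pool


per_pool, n_pools = pool_layout(8000)
print(per_pool, n_pools)
print(locate(7999))
```

Screening pools first means that only the clones of a hybridizing pool need to be replated and tested individually, which is the strategy the following paragraphs describe.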

ACPP4 was 5' end-labeled using γ-32P dATP and T4 kinase according to the manufacturer's specifications.
Nitrocellulose filters from Southern blot transfer of clone analysis DNA were hybridized (24 hours, 42°C) and
washed according to Berent et al. (BioTechniques (1985) 3:208-220). Dot blots of the same set of DNA pools
were prepared by applying 1 µg of each DNA pool to nylon membrane filters in 0.5 M NaOH. These blots were
hybridized with the probe for 24 hours at 42°C in 50% formamide/1% SDS/1 M NaCl, and washed at room
temperature in 2X SSC/0.1% SDS (1X SSC = 0.15M NaCl; 0.015M Na citrate; SDS = sodium dodecylsulfate).
DNA from the pool which was hybridized by the ACPP4 oligoprobe was transformed to JM83 cells and plated
as above to yield individual transformants. Dot blots of these individual cDNA clones were prepared by
applying DNA to nitrocellulose filters which were hybridized with the ACPP4 oligonucleotide probe and
analyzed using the same conditions as for the Southern blots of pooled DNA samples.

Nucleotide Sequence Analysis

The positive clone, pCGN1SOL, was analyzed by digestion with restriction enzymes and the following partial
map was obtained.

pUC13-Cm | -35- | 218 | -63- | 152 | -200 |
*  *
H H N P Xh E S X B Sm Ss E**

H - HindIII    N - NcoI    P - PvuII    Xh - XhoI
E - EcoRI      S - SalI    X - XbaI     Sm - SmaI
B - BamHI      Ss - SstI
*  - former PstI site destroyed with tailing
** - polylinker with available restriction sites indicated

The cDNA clone was subcloned into pUC118 and pUC119 using standard laboratory techniques of
restriction, ligation, transformation, and analysis (Maniatis et al., (1982) supra). Single-stranded DNA template
was prepared and DNA sequence was determined using the Sanger dideoxy technique (Sanger et al., (1977)
Proc. Nat. Acad. Sci. USA 74:5463-5467). Sequence analysis was performed using a software package from

pCGN1SOL contains an (approximately) 700 bp cDNA insert including a stretch of A residues at the 3'
terminus which represents the poly(A) tail of the mRNA. An ATG codon at position 61 is presumed to encode
the MET translation initiation codon. This codon is the start of a 411 nucleotide open reading frame, of which
nucleotides 229-471 encode a protein whose amino acid sequence corresponds almost perfectly with the
published amino acid sequence of ACP-I of Kuo and Ohlrogge, supra, as described previously. In addition to mature
protein, the pCGN1SOL also encodes a 56 residue transit peptide sequence, as might be expected for a
nuclear-encoded chloroplast protein.
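
The coordinates quoted above are mutually consistent, as the following Python arithmetic sketch shows. It assumes 1-based inclusive nucleotide numbering and that the transit peptide starts at the initiator ATG; whether the 411 nucleotides include the stop codon is not stated in the text, so the codon counts below are not converted into residue counts.

```python
ORF_START = 61          # position of the presumed initiator ATG
ORF_LENGTH_NT = 411     # open reading frame length in nucleotides
TRANSIT_RESIDUES = 56   # residues in the transit peptide

# Last nucleotide of the open reading frame (1-based, inclusive).
orf_end = ORF_START + ORF_LENGTH_NT - 1

# First nucleotide after the 56 transit peptide codons.
mature_start = ORF_START + 3 * TRANSIT_RESIDUES

total_codons = ORF_LENGTH_NT // 3
mature_codons = (orf_end - mature_start + 1) // 3

print(mature_start, orf_end)        # span encoding the mature protein
print(total_codons, mature_codons)  # codons in the ORF and in the mature span
```

Under these assumptions the mature protein span runs from nucleotide 229 to nucleotide 471, matching a 56-codon transit peptide followed by the mature coding region within a 137-codon frame.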

pCGN796 was constructed by ligating pCGN1SOL digested with HindIII/BamHI, pUC8 digested with HindIII
and BamHI and pUC118 digested with BamHI. The ACP gene from pCGN796 was transferred into a
chloramphenicol background by digestion with BamHI and ligation with BamHI digested pCGN565. The
resulting pCGN1902 was digested with EcoRI and SmaI and ligated to EcoRI/SmaI digested pUC118 to give
pCGN1920. The ACP gene in pCGN1920 was digested at the NcoI site, filled in by treatment with the Klenow
fragment, digested with SmaI and religated to form pCGN1919. This eliminated the 5'-coding sequences from
the ACP gene and regenerated the ATG. This ACP gene was flanked with PstI sites by digesting pCGN1919
with EcoRI, filling in the site with the Klenow fragment and ligating a PstI linker. This clone is called pCGN945.

The ACP gene of pCGN945 was moved into pUC118
to create pCGN945a so that a SmaI site (provided by the pUC118) would be at the 5'-end of the ACP
sequences to facilitate cloning into the napin cassette pCGN944. pCGN945a digested with SmaI and PstI was
ligated to pCGN944 digested with SmaI and PstI to produce the napin ACP cassette pCGN946. The napin ACP
cassette was then transferred into the binary vector pCGN783 by cloning from the HindIII site to produce
pCGN948.

Construction of the Binary Vector pCGN783

pCGN783 is a binary plasmid containing the left and right T-DNA borders of A. tumefaciens (Barker et al.,
Plant Mol. Biol. (1983) 2:335-350); the gentamicin resistance gene of pPH1JI (Hirsch et al., Plasmid (1984)
12:139-141); the 35S promoter of cauliflower mosaic virus (CaMV) (Gardner et al., Nucleic Acids Res. (1981)
9:2871-2890); the kanamycin resistance gene of Tn5 (Jorgenson et al., infra and Wolff et al., ibid (1985)
13:355-367) and the 3' region from transcript 7 of pTiA6 (Barker et al., supra (1983)).

To obtain the gentamicin resistance marker, the gentamicin resistance gene was isolated from a 3.1 kb
EcoRI-PstI fragment of pPH1JI and cloned into pUC9 yielding pCGN549. The HindIII-BamHI fragment

pCGN587 was prepared as follows: The HindIII-SmaI fragment of Tn5 containing the entire structural gene
for APHII (Jorgenson et al., Mol. Gen. Genet. (1979) 177:65) was cloned into pUC8 (Vieira and Messing, Gene
(1982) 19:259), converting the fragment into a HindIII-EcoRI fragment, since there is an EcoRI site immediately
adjacent to the SmaI site. The PstI-EcoRI fragment containing the 3'-portion of the APHII gene was then
combined with an EcoRI-BamHI-SalI-PstI linker into the EcoRI site of pUC7 (pCGN546W). Since this construct
does not confer kanamycin resistance, kanamycin resistance was obtained by inserting the BglII-PstI fragment
of the APHII gene into the BamHI-PstI site (pCGN546X). This procedure reassembles the APHII gene, so that
EcoRI sites flank the gene. An ATG codon was upstream from and out of reading frame with the ATG initiator
codon of APHII. The undesired ATG was avoided by inserting a Sau3A-PstI fragment from the 5'-end of APHII,
which fragment lacks the superfluous ATG, into the BamHI-PstI site of pCGN546W to provide plasmid [...].

The EcoRI fragment containing the APHII gene was then cloned into the EcoRI site of pCGN451,
which contains an octopine synthase cassette for expression, to provide pCGN552 (1ATG).

pCGN451 includes an octopine cassette which contains about 1556 bp of the 5' non-coding region fused via
an EcoRI linker to the 3' non-coding region of the octopine synthase gene of pTiA6. The pTi coordinates are
11,207 to 12,823 for the 3' region and 13,643 to 15,208 for the 5' region as defined by Barker et al., Plant Mol.
Biol. (1983) 2:325.

The 5' fragment was obtained as follows. A small subcloned fragment containing the 5'-end of the coding
region, as a BamHI-EcoRI fragment, was cloned in pBR322 as plasmid pCGN407. The BamHI-EcoRI fragment
has an XmnI site in the coding region, while pBR322 has two XmnI sites. pCGN407 was digested with XmnI,
resected with Bal31 nuclease and EcoRI linkers added to the fragments. After EcoRI and BamHI digestion, the
fragments were size fractionated, the fractions cloned and sequenced. In one case, the entire coding region
and 10 bp of the 5' non-translated sequences had been removed leaving the 5' non-transcribed region, the
mRNA cap site and 16 bp of the 5' non-translated region (to a BamHI site) intact. This small fragment was
obtained by size fractionation on a 7% acrylamide gel and fragments approximately 130 bp long eluted.

This size fractionated DNA was ligated into M13mp9 and several clones sequenced and the sequence
compared to the known sequence of the octopine synthase gene. The M13 construct was designated p14,
which plasmid was digested with BamHI and EcoRI to provide the small fragment which was ligated to a XhoI
to BamHI fragment containing upstream 5' sequences from pTiA6 (Garfinkel and Nester, J. Bacteriol. (1980)
144:732) and to an EcoRI to XhoI fragment containing the 3' sequences.

The resulting XhoI fragment was cloned into the XhoI site of a pUC8 derivative, designated pCGN426. This
plasmid differs from pUC8 by having the sole EcoRI site filled in with DNA polymerase I, and having lost the PstI
and HindIII site by nuclease contamination of HincII restriction endonuclease, when a XhoI linker was inserted
into the unique HincII site of pUC8. The resulting plasmid pCGN451 has a single EcoRI site for the insertion of
protein coding sequences between the 5' non-coding region (which contains 1,550 bp of 5' non-transcribed
sequence including the right border of the T-DNA, the mRNA cap site and 16 bp of 5' non-translated
sequence) and the 3' region (which contains 267 bp of the coding region, the stop codon, 196 bp of 3'
non-translated DNA, the polyA site and 1,153 bp of 3' non-transcribed sequence). pCGN451 also provides the
right T-DNA border.

The resulting plasmid pCGN451 having the ocs 5' and the ocs 3' in the proper orientation was digested with
EcoRI and the EcoRI fragment from pCGN551 containing the intact kanamycin resistance gene inserted into
the EcoRI site to provide pCGN552 having the kanamycin resistance gene in the proper orientation.
This ocs/KAN gene was used to provide a selectable marker for the trans type binary vector pCGN587.

The 5' portion of the engineered octopine synthase promoter cassette consists of pTiA6 DNA from the XhoI
at bp 15208-13644 (Barker's numbering), which also contains the T-DNA boundary sequence (border)
implicated in T-DNA transfer. In the plasmid pCGN587, the ocs/KAN gene from pCGN552 provides a
selectable marker as well as the right border. The left boundary region was first cloned in M13mp9 as a
HindIII-SmaI piece (pCGN502) (base pairs 602-2213) and recloned as a KpnI-EcoRI fragment in pCGN565 to
provide pCGN580. pCGN565 is a cloning vector based on pUC8-Cm, but containing pUC18 linkers. pCGN580
was linearized with BamHI and used to replace the smaller BglII fragment of pVCK102 (Knauf and Nester,
Plasmid (1982) 8:45), creating pCGN585. By replacing the smaller SalI fragment of pCGN585 with the XhoI
fragment from pCGN552 containing the ocs/KAN gene, pCGN587 was obtained.
The pCGN594 HindIII-BamHI region, which contains a 5'-ocs-kanamycin-ocs-3' (ocs is octopine synthase
with 5' designating the promoter region and 3' the terminator region, see U.S. application serial no. 775,923,
filed September 13, 1985) fragment was replaced with the HindIII-BamHI polylinker region from pUC18.

pCGN566 contains the EcoRI-HindIII linker of pUC18 inserted into the EcoRI-HindIII sites of pUC13-Cm. The
HindIII-BglII fragment of pNW31C-8,29-1 (Thomashow et al., Cell (1980) 19:729) containing ORF1 and -2 of
pTiA6 was subcloned into the HindIII-BamHI sites of pCGN566 producing pCGN703.

The Sau3A fragment of pCGN703 containing the 3' region of transcript 7 (corresponding to bases 2396-2920
of pTiA6 (Barker et al. (1983) supra)) was subcloned into the BamHI site of pUC18 producing pCGN709. The
EcoRI-SmaI polylinker region of pCGN709 was substituted with the EcoRI-SmaI fragment of pCGN587, which
contains the kanamycin resistance gene (APH3'-II), producing pCGN726.
The EcoRI-SalI fragment of pCGN726 plus the BglII-EcoRI fragment of pCGN734 were inserted into the
BamHI-SalI site of pUC8-Cm producing pCGN738. pCGN726c is derived from pCGN738 by deleting the 900 bp
EcoRI-EcoRI fragment.

To construct pCGN167, the AluI fragment of CaMV (bp 7144-7735) (Gardner et al., Nucl. Acids Res. (1981)
9:2871-2888) was obtained by digestion with AluI and cloned into the HincII site of M13mp7 (Messing et al.,
Nucl. Acids Res. (1981) 9:309-321) to create C614. An EcoRI digest of C614 produced the EcoRI fragment
from C614 containing the 35S promoter which was cloned into the EcoRI site of pUC8 (Vieira and Messing,
Gene (1982) 19:259) to produce pCGN146.

To trim the promoter region, the BglII site (bp 7670) was treated with BglII and resected with Bal31 and
subsequently a BglII linker was attached to the Bal31 treated DNA to produce pCGN147.

pCGN148a containing a promoter region, selectable marker (KAN with 2 ATG's) and 3' region, was prepared
by digesting pCGN528 with BglII and inserting the BamHI-BglII promoter fragment from pCGN147. This
fragment was cloned into the BglII site of pCGN528 so that the BglII site was proximal to the kanamycin gene of
pCGN528.

The shuttle vector used for this construct, pCGN528, was made as follows. pCGN525 was made by
digesting a plasmid containing Tn5 which harbors a kanamycin gene (Jorgenson et al., Mol. Gen. Genet. (1979)
177:65) with HindIII-BamHI and inserting the HindIII-BamHI fragment containing the kanamycin gene into the
HindIII-BamHI sites in the tetracycline gene of pACYC184 (Chang and Cohen, J. Bacteriol. (1978)
134:1141-1156). pCGN526 was made by inserting the BamHI fragment 19 of pTiA6 (Thomashow et al., Cell
(1980) 19:729-739), modified with XhoI linkers inserted into the SmaI site, into the BamHI site of pCGN525.
pCGN528 was obtained by deleting the small XhoI fragment from pCGN526 by digesting with XhoI and
religating.

pCGN149a was made by cloning the BamHI kanamycin gene fragment from pMB9KanXXI into the BamHI
site of pCGN148a.

pMB9KanXXI is a pUC4K variant (Vieira and Messing, Gene (1982) 19:259-268) which has the XhoI site
missing but contains a functional kanamycin gene from Tn903 to allow for efficient selection in Agrobacterium.

pCGN149a was digested with BglII and SphI. This small BglII-SphI fragment of pCGN149a was replaced with
the BamHI-SphI fragment from M1 (see below) isolated by digestion with BamHI and SphI. This produces
pCGN167, a construct containing a full length CaMV promoter, 1ATG-kanamycin gene, 3' end and the bacterial
Tn903-type kanamycin gene. M1 is an EcoRI fragment from pCGN546X (see construction of pCGN587) and
was cloned into the EcoRI cloning site of M13mp9 in such a way that the PstI site in the 1ATG-kanamycin gene
was proximal to the polylinker region of M13mp9.
The HindIII-BamHI fragment in the pCGN167 containing the CaMV-35S promoter, 1ATG-kanamycin gene
and the BamHI-fragment 19 of pTiA6 was cloned into the BamHI-HindIII sites of pUC19, creating pCGN976. The
35S promoter and 3' region from transcript 7 was developed by inserting a 0.7 kb HindIII-EcoRI fragment of
pCGN976 (35S promoter) and the 0.5 kb EcoRI-SalI fragment of pCGN709 (transcript 7: 3') into the HindIII-SalI
sites of pCGN566 creating pCGN766c.

The 0.7 kb HindIII-EcoRI fragment of pCGN766c (CaMV-35S promoter) was ligated to the 1.5 kb EcoRI-SalI
fragment in pCGN726c (1ATG-KAN 3' region) followed by insertion into the HindIII-SalI sites of pUC119 to
produce pCGN778. The 2.2 kb region of pCGN778, HindIII-SalI fragment containing the CaMV-35S promoter
and 1ATG-KAN-3' region, was used to replace the HindIII-SalI linker region of pCGN739 to produce pCGN783.
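
The construction of pCGN783 passes through many intermediates, so it can help to write the stated derivations down as a dependency graph. The following is only a bookkeeping sketch in Python (3.9 or later, for `graphlib`): the `PARENTS` table lists, for each construct, the constructs the text names as its direct sources, and general cloning vectors such as pUC8 or pUC18 are left out. The function names are illustrative and not part of the specification.

```python
from graphlib import TopologicalSorter

# construct -> constructs named in the text as its direct sources
# (general cloning vectors such as pUC8/pUC18 are deliberately omitted)
PARENTS = {
    "pCGN552": ["pCGN451", "pCGN551"],
    "pCGN580": ["pCGN502", "pCGN565"],
    "pCGN585": ["pCGN580", "pVCK102"],
    "pCGN587": ["pCGN585", "pCGN552"],
    "pCGN703": ["pCGN566"],
    "pCGN709": ["pCGN703"],
    "pCGN726": ["pCGN709", "pCGN587"],
    "pCGN738": ["pCGN726", "pCGN734"],
    "pCGN726c": ["pCGN738"],
    "pCGN146": ["C614"],
    "pCGN147": ["pCGN146"],
    "pCGN526": ["pCGN525"],
    "pCGN528": ["pCGN526"],
    "pCGN148a": ["pCGN528", "pCGN147"],
    "pCGN149a": ["pCGN148a", "pMB9KanXXI"],
    "M1": ["pCGN546X"],
    "pCGN167": ["pCGN149a", "M1"],
    "pCGN976": ["pCGN167"],
    "pCGN766c": ["pCGN976", "pCGN709", "pCGN566"],
    "pCGN778": ["pCGN766c", "pCGN726c"],
    "pCGN783": ["pCGN778", "pCGN739"],
}


def build_order(parents):
    """Return the constructs in an order where every source precedes its product."""
    return list(TopologicalSorter(parents).static_order())


def ancestors(target, parents):
    """Collect every construct that feeds, directly or indirectly, into target."""
    seen = set()
    stack = [target]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


order = build_order(PARENTS)
lineage = ancestors("pCGN783", PARENTS)
```

Listing `order` gives one valid sequence in which the intermediates could be built, and `lineage` shows which earlier constructs (for example pCGN451 and pCGN587) contribute to the final binary vector.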


Transfer of the Binary Vector pCGN948 into Agrobacterium 

pCGN948 was introduced into Agrobacterium tumefaciens EHA101 (Hood et al., J. Bacteriol. (1986)
168:1291-1301) by transformation. An overnight 2 ml culture of EHA101 was grown in MG/L broth at 30°C. 0.5
ml was inoculated into 100 ml of MG/L broth (Garfinkel and Nester, J. Bacteriol. (1980) 144:732-743) and grown
in a shaking incubator for 5 h at 30°C. The cells were pelleted by centrifugation at 7K, resuspended in 1 ml of
MG/L broth and placed on ice. Approximately 1 µg of pCGN948 DNA was placed in 100 µl of MG/L broth to
which 200 µl of the EHA101 suspension was added; the tube containing the DNA-cell mix was immediately
placed into a dry ice/ethanol bath for 5 minutes. The tube was quick thawed by 5 minutes in a 37°C water bath,
followed by 2 h of shaking at 30°C after adding 1 ml of fresh MG/L medium. The cells were pelleted and spread
onto MG/L plates (1.5% agar) containing 100 mg/l gentamicin. Plasmid DNA was isolated from individual
gentamicin-resistant colonies, transformed back into E. coli, and characterized by restriction enzyme analysis
to verify that the gentamicin-resistant EHA101 contained intact copies of pCGN948. Single colonies were picked
and purified by two more streakings on MG/L plates containing 100 mg/l gentamicin.

Transformation and Regeneration of B. napus

Seeds of Brassica napus cv Westar were soaked in 95% ethanol for 4 minutes. They were sterilized in 1%
solution of sodium hypochlorite with 60 µl of Tween 20 surfactant per 100 ml sterilant solution. After soaking
for 45 minutes, seeds were rinsed 4 times with sterile distilled water. They were planted in sterile plastic
boxes 7 cm wide, 7 cm long, and 10 cm high (Magenta) containing 50 ml of 1/10th concentration of MS
(Murashige minimal organics medium, Gibco) with added pyridoxine (500 µg/l), nicotinic acid (50 µg/l), glycine
(200 µg/l) and solidified with 0.6% agar. The seeds germinated and were grown at 22°C in a 16h-8h light-dark
cycle with light intensity approximately 65 µEm⁻²s⁻¹. After 5 days the seedlings were taken under sterile
conditions and the hypocotyls excised and cut into pieces of about 4 mm in length. The hypocotyl segments
were placed on a feeder plate or without the feeder layer on top of a filter paper on the solidified B5 0/1/1 or
B5 0/1/0 medium. B5 0/1/0 medium contains B5 salts and vitamins (Gamborg, Miller and Ojima, Experimental
Cell Res. (1968) 50:151-158), 3% sucrose, 2,4-dichlorophenoxyacetic acid (1.0 mg/l), pH adjusted to 5.8, and
the medium is solidified with 0.6% Phytagar; B5 0/1/1 is the same with the addition of 1.0 mg/l kinetin. Feeder
plates were prepared 24 hours in advance by pipetting 1.0 ml of a stationary phase tobacco suspension culture
(maintained as described in Fillatti et al., Molecular General Genetics (1987) 206:192-199) onto B5 0/1/0 or
B5 0/1/1 medium. Hypocotyl segments were cut and placed on feeder plates 24 hours prior to Agrobacterium
treatment.

Agrobacterium tumefaciens (strain EHA101 x 948) was prepared by incubating a single colony of
Agrobacterium in MG/L broth at 30°C. Bacteria were harvested 16 hours later and dilutions of 10⁸ bacteria per
ml were prepared in MG/L broth. Hypocotyl segments were inoculated with bacteria by placing the segments
in an Agrobacterium suspension and allowing them to sit for 30-60 minutes, then removing and transferring to
Petri plates containing B5 0/1/1 or 0/1/0 medium (0/1/1 intends 1 mg/l 2,4-D and 1 mg/l kinetin and 0/1/0
intends no kinetin). The plates were incubated in low light at 22°C. The co-incubation of bacteria with the
hypocotyl segments took place for 24-48 hours. The hypocotyl segments were removed and placed on
B5 0/1/1 or 0/1/0 containing 500 mg/l carbenicillin (kanamycin sulfate at 10, 25, or 50 mg/l was sometimes
added at this time) for 7 days in continuous light (approximately 65 µEm⁻²s⁻¹) at 22°C. The segments were
transferred to B5 salts medium containing 1% sucrose, 3 mg/l benzylaminopurine (BAP) and 1 mg/l zeatin.
This was supplemented with 500 mg/l carbenicillin, 10, 25, or 50 mg/l kanamycin sulfate, and solidified with
0.6% Phytagar (Gibco). Thereafter, explants were transferred to fresh medium every two weeks.

After one month green shoots developed from green calli which were selected on media containing
kanamycin. Shoots continued to develop for three months. The shoots were cut from the calli when they were
at least 1 cm high and placed on B5 medium with 1% sucrose, no added growth substances, 300 mg/l
carbenicillin, and solidified with 0.6% Phytagar. The shoots continued to grow, and several leaves were
removed to test for neomycin phosphotransferase II (NPTII) activity. Shoots which were positive for NPTII
activity were placed in Magenta boxes containing B5 0/1/1 medium with 1% sucrose, 2 mg/l indolebutyric
acid, 200 mg/l carbenicillin, and solidified with 0.6% Phytagar. After a few weeks the shoots developed roots
and were transferred to soil. The plants were grown in a growth chamber at 22°C in a 16-8 hour light-dark
cycle with light intensity 220 µEm⁻²s⁻¹ and after several weeks were transferred to the greenhouse.

Southern Data

Regenerated B. napus plants from cocultivations of Agrobacterium tumefaciens EHA101 containing
pCGN948 and B. napus hypocotyls were examined for proper integration and embryo-specific expression of
the spinach leaf ACP gene. Southern analysis was performed using DNA isolated from leaves of regenerated
plants by the method of Dellaporta et al. (Plant Mol. Biol. Rep. (1983) 1:19-21) and purified once by banding in
CsCl. DNA (10 µg) was digested with the restriction enzyme EcoRI, electrophoresed on a 0.7% agarose gel
and blotted to nitrocellulose (see Maniatis et al. (1982) supra). Blots were probed with pCGN945 DNA
containing 1.8 kb of the spinach ACP sequence or with the EcoRI/HindIII fragment isolated from pCGN936c
(made by transferring the HindIII/EcoRI fragment of pCGN930 into pCGN566) containing the napin 5'
sequences labeled with 32P-dCTP by nick translation (described by the manufacturer, BRL Nick Translation
Reagent Kit, Bethesda Research Laboratories, Gaithersburg, MD). Blots were prehybridized and hybridized in
50% formamide, 10x Denhardt's, 5xSSC, 0.1% SDS, 5 mM EDTA, 100 µg/ml calf thymus DNA and 10% dextran
sulfate (hybridization only) at 42°C. (Reagents described in Maniatis et al. (1982) supra.) Washes were in
1xSSC, 0.1% SDS, 30 min and twice in 0.1xSSC, 0.1% SDS at 55°C.

Autoradiograms showed two bands of approximately 3.3 and 3.2 kb hybridized in the EcoRI digests of DNA
from four plants when probed with the ACP gene (pCGN945), indicating proper integration of the spinach leaf
ACP construct in the plant genome since 3.3 and 3.2 kb EcoRI fragments are present in the T-DNA region of
pCGN948. The gene construct was present in single or multiple loci in the different plants as judged by the
number of plant DNA-construct DNA border fragments detected when probed with the napin 5' sequences.


Northern Data 

Expression of the integrated spinach leaf ACP gene from the napin promoter was detected by Northern
analysis in seeds but not leaves of one of the transformed plants shown to contain the construct DNA.
Developing seeds were collected from the transformed plant 21 days post-anthesis. Embryos were dissected
from the seeds and frozen in liquid nitrogen. Total RNA was isolated from the seed embryos and from leaves of
the transformed plant by the method of Crouch et al. (Virology (1985) 140:281-288) and blotted to
nitrocellulose (Thomas, Proc. Natl. Acad. Sci. USA (1980) 77:5201-5205). Blots were prehybridized, hybridized,
and washed as described above. The probe was an isolated PstI/BamHI fragment from pCGN945 containing
only spinach leaf ACP sequences labeled by nick translation.

An RNA band of ~0.8 kb was detected in embryos but not leaves of the transformed plant, indicating
seed-specific expression of the spinach leaf ACP gene.

Example II 

Construction of B. campestris Napin Promoter Cassette

A BglII partial genomic library of B. campestris DNA was made in the lambda vector Charon 35 using
established protocols (Maniatis et al., 1982, supra). The titer of the amplified library was ~1.2 x 10⁹ phage/ml.
Four hundred thousand recombinant bacteriophage were plated at a density of 10⁵ per 9 x 9 in. NZY plate
(NZYM as described in Maniatis et al., 1982, supra) in NZY + 10 mM MgSO4 + 0.9% agarose after adsorption
to DH1 E. coli cells (Hanahan, J. Mol. Biol. (1983) 166:557) for 20 min at 37°C. Plates were incubated at 37°C for
~13 hours, cooled at 4°C for 2.5 hours and the phage were lifted onto Gene Screen Plus (New England
Nuclear) by laying precut filters over the plates for approximately 1 min and peeling them off. The adsorbed
phage DNA was immobilized by floating the filter on 1.5 M NaCl, 0.5 M NaOH for 1 min, neutralizing in 1.5 M
NaCl, 0.5 M Tris-HCl, pH 8.0 for 2 min and 2xSSC for 3 min. Filters were air dried until just damp, prehybridized
and hybridized at 42°C as described for Southern analysis. Filters were probed for napin-containing clones
using an XhoI/SalI fragment of the cDNA clone BE5 which was isolated from the B. campestris seed cDNA
library described using the probe pN1 (Crouch et al., 1983, supra). Three plaques hybridized strongly on
duplicate filters and were plaque purified as described (Maniatis et al., 1982, supra).
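
The plating figures in the library screen imply a small amount of arithmetic, sketched here in Python. The sketch reads the plating density as 10⁵ phage per plate and the titer as 1.2 x 10⁹ phage/ml, and it assumes that every phage particle gives a plaque; the function names are illustrative only.

```python
import math

TITER_PER_ML = 1.2e9       # amplified library titer, phage/ml
PHAGE_TO_SCREEN = 400_000  # recombinant phage plated in total
PHAGE_PER_PLATE = 1e5      # plating density per 9 x 9 in. plate


def plates_needed(total_phage, phage_per_plate):
    """Number of plates required to hold total_phage at the given density."""
    return math.ceil(total_phage / phage_per_plate)


def stock_volume_ul(total_phage, titer_per_ml):
    """Volume of undiluted library stock (in microliters) holding total_phage."""
    return total_phage / titer_per_ml * 1000.0


n_plates = plates_needed(PHAGE_TO_SCREEN, PHAGE_PER_PLATE)
volume_ul = stock_volume_ul(PHAGE_TO_SCREEN, TITER_PER_ML)
```

Under these assumptions the screen occupies four plates and uses well under a microliter of undiluted stock, so in practice the stock would be diluted before plating.
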
One of the clones named lambda CGN1-2 was restriction mapped and the napin gene was localized to
overlapping 2.7 kb XhoI and 2.1 kb SalI restriction fragments. The two fragments were subcloned from lambda
CGN1-2 DNA into pCGN789 (a pUC based vector the same as pUC119 with the normal polylinker replaced by
the synthetic linker 5' GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT 3', which represents
the polylinker EcoRI, SalI, BglII, PstI, XhoI, BamHI, HindIII). The identity of the subclones as napin was
confirmed by sequencing. The entire coding region sequence as well as extensive 5' upstream and 3'
downstream sequences were determined (Figure 2). The lambda CGN1-2 napin gene is that encoding the
mRNA corresponding to the BE5 cDNA as determined by the exact match of their nucleotide sequences.
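
The stated order of sites in the synthetic linker can be checked by scanning the linker for each enzyme's recognition sequence. The Python sketch below uses the standard six-base recognition sequences of the seven enzymes named in the text; the function name `map_sites` is illustrative.

```python
LINKER = "GGAATTCGTCGACAGATCTCTGCAGCTCGAGGGATCCAAGCTT"

# Standard recognition sequences of the enzymes named for the linker
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BglII": "AGATCT",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def map_sites(seq, sites):
    """Return (1-based start position, enzyme) for every site found, sorted by position."""
    seq = seq.upper()
    hits = []
    for name, motif in sites.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, name))
            start = seq.find(motif, start + 1)
    return sorted(hits)


site_map = map_sites(LINKER, SITES)
site_order = [name for _, name in site_map]
```

Each site occurs once, and `site_order` follows the order given in the text (EcoRI, SalI, BglII, PstI, XhoI, BamHI, HindIII), with consecutive sites starting six bases apart.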

An expression cassette was constructed from the 5'-end and the 3'-end of the lambda CGN1-2 napin gene
as follows in an analogous manner to the construction of pCGN944. The majority of the napin coding region of
pCGN940 was deleted by digestion with SalI and religation to form pCGN1800. Single-stranded DNA from
pCGN1800 was used in an in vitro mutagenesis reaction (Adelman et al., DNA (1983) 2:183-193) using the
synthetic oligonucleotide 5' GCTTGTTCGCCATGGATATCTTCTGTATGTTC 3'. This oligonucleotide inserted an
EcoRV and an NcoI restriction site at the junction of the promoter region and the ATG start codon of the napin
gene. An appropriate mutant was identified by hybridization to the oligonucleotide used for the mutagenesis
and sequence analysis and named pCGN1801.
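
The arrangement of the two engineered sites can be made visible by reading the mutagenic oligonucleotide on the opposite strand. The Python sketch below assumes that the oligonucleotide is written antisense to the coding strand; this is an inference from the fact that its reverse complement ends in ATGGCGAACAAGC (reading Met-Ala-Asn-Lys), not a statement made in the text.

```python
# Translation table used to complement each base
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO = "GCTTGTTCGCCATGGATATCTTCTGTATGTTC"

sense = reverse_complement(OLIGO)  # strand assumed to read promoter -> ATG
ecorv_at = sense.find("GATATC")    # EcoRV recognition sequence
ncoi_at = sense.find("CCATGG")     # NcoI recognition sequence
atg_at = sense.find("ATG", ncoi_at)
```

On the assumed sense strand the EcoRV site lies immediately upstream of the NcoI site (the two share one base), and the ATG sits inside the NcoI site, which is consistent with the description of both sites being placed at the junction of the promoter and the start codon.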

A 1.7 kb promoter fragment was subcloned from pCGN1801 by partial digestion with EcoRV and ligation to
pCGN786 (a pCGN566 chloramphenicol based vector with the synthetic linker described above in place of the
normal polylinker) cut with EcoRI and blunted by filling in [...]. The resulting expression cassette, pCGN1803,
contains 1.725 kb of napin promoter sequence and 1.265 kb of napin 3' sequences with the unique cloning
sites SalI, BglII, PstI and XhoI in between. Any sequence that requires seed-specific transcription or
expression in Brassica, i.e., a fatty acid gene, could be inserted in this cassette in a manner analogous to that
described for spinach leaf ACP and the B. napus napin cassette in Example I.

Other seed-specific promoters may be isolated from genes encoding proteins involved in seed
triacylglycerol synthesis, such as acyl carrier protein from Brassica seeds. Immature seed were collected from
Brassica campestris cv. R-500, a self-compatible variety of turnip rape. Whole seeds were collected at
stages corresponding approximately to 14 to 28 days after flowering. RNA isolation and preparation of a cDNA
bank was as described above for the isolation of a spinach ACP cDNA clone except that the vector used was
pCGN[...]. To probe the cDNA bank, the oligonucleotide (5')-ACTTTCTCAACTGTCTCTGGTTTAGCAGC-(3')
was synthesized using an Applied Biosystems DNA Synthesizer, model 380A, according to manufacturer's
recommendations. This synthetic DNA molecule will hybridize at low stringencies to DNA or RNA sequences
coding for the amino acid sequence (ala-ala-lys-pro-glu-thr-val-glu-lys-val). This amino acid sequence has
been reported for ACP isolated from seeds of Brassica napus (Slabas et al., 7th International Symposium of
the Structure and Function of Plant Lipids, University of California, Davis, CA, 1986); ACP from B. campestris
seed is highly homologous. Approximately 2200 different cDNA clones were analyzed using a colony
hybridization technique (Taub and Thompson, Anal. Biochem. (1982) 126:222-230) and hybridization
conditions corresponding to Wood et al. (Proc. Natl. Acad. Sci. (1985) 82:1585-1588). DNA sequence analysis
of two cDNA clones showing obvious hybridization to the oligonucleotide probe indicated that one,
designated pCGN1Bcs, indeed coded for an ACP-precursor protein by the considerable homology of the
encoded amino acid sequence with ACP proteins described from Brassica napus (Slabas et al., 1986, supra).
Similarly to Example II, the ACP cDNA clone can be used to isolate a genomic clone from which an expression
cassette can be fashioned in a manner directly analogous to the B. campestris napin cassette.
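
The relationship between the ACP probe and the peptide it targets can be checked by translating the strand complementary to the probe. The Python sketch below takes the probe as (5')-ACTTTCTCAACTGTCTCTGGTTTAGCAGC-(3') and uses the standard genetic code; the helper names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


PROBE = "ACTTTCTCAACTGTCTCTGGTTTAGCAGC"

target = reverse_complement(PROBE)  # coding-strand sequence the probe pairs with
peptide = translate(target)         # one-letter amino acid codes
leftover = target[len(target) - len(target) % 3:]
```

The 29-base probe covers nine complete codons, which translate to AAKPETVEK (ala-ala-lys-pro-glu-thr-val-glu-lys), plus the first two bases (GT) of a tenth codon; every codon beginning GT encodes valine, matching the final residue of the stated peptide.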

Ninety-six clones from the 14-28 day post-anthesis B. campestris seed cDNA library (described in the
previous example) were screened by dot blot hybridization of miniprep DNA on Gene Screen Plus nylon filters.
Probes used were radioactively labeled first-strand synthesis cDNAs made from the day 14-28 post-anthesis
mRNA or from B. campestris leaf mRNA. Clones which hybridized strongly to seed cDNA and little or not at all
to leaf cDNA were catalogued. A number of clones were identified as representing the seed storage protein
napin by cross-hybridization with an XhoI/SalI fragment of pN1 (Crouch et al., 1983, supra) [...] a B. campestris
genomic clone as a source of an embryo-specific promoter.

Other seed-specific genes may also serve as useful sources of promoters. cDNA clones of cruciferin, the
other major seed storage protein of B. napus, have been identified (Simon et al., 1985, supra) and could be
used to screen a genomic library for promoters.

Without knowing their specific functions, yet other cDNA clones can be classified as to their level of
expression in seed tissues, their timing of expression (i.e., when post-anthesis they are expressed) and their
approximate representation (copy number) in the B. campestris genome. Clones fitting the criteria necessary
for expressing genes related to fatty acid synthesis or other seed functions can be used to screen a genomic
library for genomic clones which contain the 5' and 3' regulatory regions necessary for expression. The
non-coding regulatory regions can be manipulated to make a tissue-specific expression cassette in the
general manner described for the napin genes in previous examples.

One example of a cDNA clone is EA9. It is highly expressed in seeds and not leaves from B. campestris. It
represents a highly abundant mRNA as shown by cross-hybridization of seven other cDNAs from the library by
dot blot hybridization. Northern blot analysis of mRNA isolated from day 14 seeds and day 21 and 28
post-anthesis embryos using a 700 bp EcoRI fragment of EA9 as a probe shows that EA9 is highly expressed
at day 14 and expressed at a much lower level at day 21 and day 28. The restriction map of EA9 was determined
and the clone sequenced. Identification of a polyadenylation signal and of polyA tails at the 3'-end of EA9
confirms the orientation of the cDNA clone and the direction of transcription of the mRNA. The partial
sequence provided here for clone EA9 (Figure 3) can be used to synthesize a probe which will identify a unique
class of Brassica seed-specific promoters.

It is evident from the above results that transcription or expression can be obtained specifically in seeds, so
as to permit the modulation of phenotype or change in properties of a product of seed, particularly of the
embryo. It is found that one can use transcriptional initiation regions associated with the transcription of
sequences in seeds in conjunction with sequences other than the normal sequence to produce endogenous
or exogenous proteins or modulate the transcription or expression of nucleic acid sequences. In this manner,
seeds can be used to produce novel products, to provide for improved protein compositions, to modify the
distribution of fatty acid, and the like.

All publications and patent applications mentioned in this specification are indicative of the level of skill of
those skilled in the art to which this invention pertains. All publications and patent applications are herein
incorporated by reference to the same extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by reference.



Claims



1. A seed comprising an expression cassette, said cassette comprising a seed specific transcriptional
initiation region, a sequence of interest under the transcriptional regulation of said initiation region, and a
transcriptional termination region, said expression cassette inserted into the genome of said seed at
other than the natural site for said transcriptional initiation region.

2. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
[...]

3. A seed according to Claim 1, wherein said sequence of interest is an open reading frame encoding an
exogenous protein.

4. A seed according to Claim 1, wherein said sequence of interest encodes a sequence complementary
to a transcription product of said seed.

5. A seed according to Claim 1, wherein said seed is of the Brassica family.

6. An expression construct comprising a seed specific transcriptional initiation region, a polylinker of
less than about 100 bp having at least two restriction sites for insertion of a DNA sequence to be under
the transcriptional control of said initiation region, and a transcriptional termination region, the sequence
of said polylinker being other than the sequence of the gene naturally under the transcriptional control of
said initiation region.

7. An expression construct according to Claim 6, wherein said transcription initiation region is from a
Brassica seed gene.

8. An expression construct according to Claim 7, wherein said gene is a napin gene.

9. An expression cassette comprising a seed specific transcriptional initiation region, a DNA sequence
of interest other than the natural sequence joined to said initiation region, to be under the transcriptional
control of said initiation region, and a transcriptional termination region.

10. An expression cassette according to Claim 9, wherein said transcriptional initiation region is a
Brassica gene initiation region.

11. An expression cassette according to Claim 10, wherein said gene is a napin gene.

12. An expression cassette according to Claim 10, wherein said sequence of interest is a structural gene.

13. An expression cassette according to Claim 12, wherein said structural gene encodes a protein in the
biosynthetic pathway for fatty acid production.

14. An expression cassette according [...]

15. A vector comprising an expression cassette according to any of Claims 9 to 14, a prokaryotic
replication system, and a marker for selection of transformed prokaryotes comprising said marker.

16. A method for modifying the genotype of a seed comprising:
growing a plant to seed production, wherein cells of said plant comprise an expression cassette
according to any of Claims 9 to 14,
whereby seeds are produced of modified genotype.

NAP-270 Linear LENGTH - 702

SfaNI 

Tokl Maeill MboII 

I II' J 

1 AAAGGGATGTGTCTTGGTATGTATCTACGAGTAACMAAGAGAAGATGCAATTGAGTAGTAGAA5GATTTG ' 

19 31 55 . 

36 

Alul Ahalll Fokl EcoRV Hgal 

II II I 

72 AGAGCTTTT45AAGCC34TCA5GTGTGTGCT J TT4ATClTATTGATATCATCCATT4GCGTTGTrrAATGC5 1 
76 82 106 117 130 

Ddel 

143 TCmAGATATGTWCTGTWCTTTCTCAGTGTCTGAATATCTGATAAGTGCAATGTGAGAAAGCCACACC 2 

169 

TaqI 

3apl Hinfl 
I II 

214 AAACCAAAATATTCAAATCTTATATTTTTAATAATGTCGAATCACTCGGAGTTGCCACCTTCTGTGCCAAT . 2 
224 233 

251 

Hinfl MboXZ EcoRl 

I I I 

283 TGTGCTGAATCTATCACACTAAAAAAAACATTTCXTCAAGGTAATGACTTGTGGACTATCGTTCTGAATTC 3 
292 310 351. 

Maell 

356 TCATTAAGTTTTTATTTTTTGAAGTTTAAGTTTTTACCTTCT1TTTTGAAAAATATCGTTCATAAGATGTC 4 

424 . 

sphl 

ScrIT Nlatll Nap (7524) I 

EcoRlI Alul SfaNI Nlalll Maeill 

II II I I I 

427 ACGCCAGGACATGAGCTACACATCACATATTAGCATGCAGA7GCGGACGATTTGTCAC7CACTTCAAACAC 4 
430 442 457 464 480 

432 440 464 

464

Sapl PipaCI
SphI : iMboi 

Tthllixi Ndel Nap (7524) I Maell 

• Alul AvalXI Nlaill Afllil Opnl tfialll 

II II i I Mill I 

438 CTAAAACAGCTTCTCTCTCACACCACACACACATATCCATGCAATAOTTACACGTOATCGCCATGOy^AIC 5 
507 532 539 548 • 555 563 

508 531 539 - 350 

539 553 
• 543 551 

5- 

HphI Mnll 
I I 

569 TCCATTCTCACCTATAMTTAGAGGCTCGGCTTCACTTTTTACTCA7ACCAAAAGXCATCACTACAAAACA 6 
569 583 

MboII AluI MnlI

II I 

640 TACACAAAIfiGCGAACAAGCTCTTCCTCGTCTCGGCAACTCTCGCCTTGTTCTTCCTTCTCAC 702 

653 659 675 - 




Partial sequence of the promoter region of the λBnNa napin
gene. The start (ATG) of the open reading frame is underlined.
Unconfirmed or ambiguous nucleotides in the sequence are
designated by numbers.
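
The annotations in a map of this kind (enzyme names placed over site positions, with digits standing in for unconfirmed bases) can be illustrated with a minimal Python sketch. It is not the program that produced the figure. The helper names `find_sites` and `ambiguous_positions` and the test fragment are hypothetical; only the recognition sequences for EcoRI, EcoRV and HindIII are standard. Positions are reported as the 1-based start of the recognition sequence, which is an assumption and may differ from the numbering convention used in the figure.

```python
import re

# Standard recognition sequences for three enzymes named in the map.
SITES = {
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "HindIII": "AAGCTT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition sequence."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        # A lookahead pattern reports overlapping occurrences as well.
        hits[enzyme] = [m.start() + 1 for m in re.finditer(f"(?={motif})", seq)]
    return hits


def ambiguous_positions(seq):
    """Return 1-based positions of digit placeholders (unconfirmed bases)."""
    return [i + 1 for i, ch in enumerate(seq) if ch.isdigit()]


# Hypothetical fragment, not taken from the figure: three sites and one
# digit-coded ambiguous base.
fragment = "AAGCTTGATATC5GAATTC"
print(find_sites(fragment))
print(ambiguous_positions(fragment))
```

Because a digit never matches A, C, G or T, a recognition sequence interrupted by an ambiguous base is simply not reported, so a site listed in the figure may be absent from such a scan.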
Lambda CCS1-2

NCG-186 Linear LENGTH - 4325 

XhoI

TaqI HindIII
AvaI AluI TaqI

II Mil 
1 CTCGAGGCXGTCACTAACATGAAGTTTGACGMGAGCCCAAC'IATGGGftAGCTTATTTCTCTTiTCCAT 69 

2 .32 66 

3 50 

2 

SacI

HhaI XbaI AluI
II II 
70 ACTCIAATTGAGCCGTGCGCTCTATCTAGACCAATTAGAATTGATGGAGCTCTAAAGGTTGCTGGCXGT 138 

89 95 119 

121 

NdeI NdeI
I I. 
139 tttcttgttcatatgattaacttctaaacttgtgtataaatattctctgaaagtgcttcttttggcata 20? 

150 -' 206 

208 tgtaggttgggcaaaaacgaggaagattgcttctcaatttggaagaggatgaacagccgaagaagaaaa 276' 



Sau3AI
DdeI

277 TMGMTAGGCAGTCCTGCTACTCAATGGATCTCAGTCTATAACGGTCGTCGTCCCATGAAACAGAGGT 345 

309 

305 

EcoRV 

346 AAXXCATTTTTTGCATATACACTTTGAAAGTTCCTCACTAACTGTGTAATCTTTTGGTAGATATCACTA 414 

408 
Hindi
HhaI

HaeIII

DdeI
BstEII

BalI HaeIII AluI

III .11 
415 CAATGTCGGAGAGACAA3GGCTGMNCMICATATACAAAAGGGAMTGAAGATGGCCrSTTGATTAGCTG 483 

439 . 469 481 . 

438 

439 

439 

440 
436 

AluI HinfI

i i 

484 TGTAGCATCAG<^CTAATCTCTGGGCTCTCATCATGGATGCTGGAACTG<3ATTCACTTCTCAAGTT?A 552 . 
498 535 

MspI

HpaII HinfI

I I 
553 TGAGTTGTCACCGGTCTTCCTACACAAGGTAATAATCAGTTGAAGCAATTAAGAATCAAT'rTGATTTGT 621 

564 606 : 

564 

DdeI

622 AGTAAACTAAGAAGAACTTACCTTATGTTTtCCCCGCAGGACIGGATTAIGGAACAATGGGAAAAGAAC 690 
629 

SacI

AluI AluI AluI
691 TACTATATAAGCTCCATAGCTGGTTCAGATAACGGGAGCTCTTTAGTTGTTATGTCAAAAGGTTAGTGT 759 

702 710 729 

731 

760 TTAGTGAATAATAAACTTATACCACAAAGTCTTCATTGACTrATTTATATACTTGWGTGAATTGCTAG 828 



DdeI HinfI
829 GAACTACTTATTCTCAGCAGTCATACAAAGTGAGTGACTCATTTCCGTTCAAGTGGATAAATAAGAAAT 897 

842 865 
XmnI TaqI
! •• | 

898 GGAAAGAAGATTTTCATGTAXCCTCCATGACAACTGCX6GTAATCGTTGGGGTGTG0TAATGTCGAGGA 966 

908 961 . 

Sau3AI

BclI

I 

967 ACTCTGGCTTCTCTGATC^GTAGGTTTTTGTCXCTTATT^ 1035 

981 
981 

AluI RsaI

1036 CTAATATGATMACTCTGCGTTGTGAAAGGTGGTGGAGCTTGACTTTTTGTACCCAAGCGATGGGATAC 1104 

1074 1087 

Sau3AI AluI

.1105 ATAGGAGCTGGGAGAATGGGTATAGAATAACATCAATGGCAGCAACTGCGGATC7VAGCAGCTTTCATAT 1173 . 

1155 . 1165 

HinfI RsaI

1174 TMGCATACCAAAGCGTAAGATGGTGGATGAAACTCAAGAGACTCTCCGCACCACCGCCTTTCCAAGTA 1242 

1215 1242 

1242 • 

AluI Sau3AI DdeI

1243 CTCATGTCAAGGTTGGTTTCTTTAGCTTTGAACACAGATTTGGATCTX 1311 

1268 1285 1311 

DdeI

AvaII AluI HinfI RsaI

III II 
1312 TAGGACCTGAGAGCTCTTGGTTGATTmWKCAGGACAAATGGGCGMGAATqTGTACATTGCATCA ; 1380, 

1315 1325 1363 1370 

1319 

1381 ATATGCTATGGCAGGACAGTGXGCTGATAC^CACTTAAGCATCATGTGGaAAGCGAAAGACAATTGGAG 1449 
HinfI
DdeI
I I 

1450 CGAGACTCAGCGTCGTCATAATftCCAATCAAAGftCGTAAAACCAGACGCAACCTCTTTGGTTGAATGTA 1518 

1456 
1454 

RsaI

I 

1519 ATGAAAGGGATGTGTCTTGGTATGTATGTACGAATXACAMAGAGAAGATGGAATTAGTAGTAGAAATA 1587 

1548 

AluI EcoRV

I I 
1588 TTTGGGAGCTTTTTAAGCCCTTCAAGTGTGCTITT'IATCTTATTGATATCATCCATTTGCGTTGTTTAA 1656 

^ • 

1596 1635 

XbaI DdeI
I I 
1657 TGCGTCTCTAGATATGTTCCTATATCTTTCTCAGTGTCTGATAAGTGAAATGTGAGAAAACCATACCAA 172S 

1664 1687 

HinfI

I 

1726 ACCAAAATATTCAAATCTTATTTTTAATAATGTTGAATCACTCGGAGTTGCCACCTTCTGTGCCAATTG .1794 

1761 

HinfI EcoRI

1 I 

1795 TGCTGAATCTATCACACTAGAAAAAMCATTi'CTTCAAGGTAATGACTTGTGGAGTATGTTCTGAATTC 1863 
1800 1859 

1864 TCATTAAGTTTTTATTX'ICTGAAGTTTAAGTTITTACCTTCTGTTTTGAAATATATCGTTCATAAGATG. 1932 



SphI 

BstNI AluI Sau3AI

II II 

1933 tcacgccaggacatgagctacacatcgcacatagcatgcagatcaggacgatttgtcactcacttcaaa 2001 

1940 1950 1973 

1971 
Mai Alul Hh aI Ndal Nb11 gau3AI

2002 ^CC*AAGAGCTTCTCTCTCACAGCGCACA(&CATATGCA^ 2070 

2006 2012 2028 2036 2042 2058: 

2044 

2071 ATCTCCATOCTCACCTATAAATTAGA^ 2 j 39 



AluI

2140 GAACATACACAAMG^GAACMGCTCITCCTCGTCTCGGCAACTCTCGCCTTGTTCTTCCTTCTCACC 2208 

METAl4A8nLysLeuPheleuValS«rAl«ThrI«uMaLeiuPhePhaLeuLeuThr 
2164 

**** Hael 
Sail Mspl 
_ _ Hlnoli Hpali 

f cl n |«I • P B aelIl 

2209 ^'J^^^^^CAGGACGGTTGTGGAAGTCGACCAAGATGATGCCACAAATCCAGCWGCCCATTT 2277 
AanAlaSerVal^rArgThrValValGluValAflpGluAapAspAla^^^^ 

,J241 2271 

2«5 2268 
2240 2268 
2241 2269 



* inf 1 Alul 



2278 AGGATOCCAAAATGTWGAAGGAGWTCAGCAAGCACJ^CACCTGAAAGC* * wuv-«*v.i.taiavivww; 
5 228i r Y y aAr9l,y8G ^h^^GlnAlaGlnHlaiauLyaAlaCyaG^GlnTrpLeuHii 



GCTCCAC 2346 



2327 
2325 



MspI AvaII

2347 




2364 2382 
HaeIII SacI

ApaI HaeIII AluI

I I | || 

2416 CTCGA6Jl^CAACAACAGGGCCCGCAGCAGAGGCCACCGCTGCTCCAGC*CTGCTGCAAQ»AGCTCCAC 2484 
ValGluA«nGlnGlnGlnGIy*roGlnGlnArgProProL8uLouGlnGlnCy8py«A«nGluLeuHi9 

2438 2449 2479 

2436 i 2481 

BstMI . Hinfl 51 

i II 

2485 CACGMGAGCCACtTTGCGTTTGCCCAACClTGAAAGGAGCAX(XA*AGCCGTTJUliCAACAGATTC5A 25S3 

GlnGluGluProl*uCy8V«lCyBProThri«uLyaGlyAlaSerLy8MaVaaiMGin"GlnXl'eAr& 
2486 25^8 »■ 

2551 

2S54 CAACMCAGGGACAACAAATGCAGGGACAGCAGATGCAGCAAGTGATTAGCCGTATCtACCAGACCGCT 2622 
GlnGlnGlnGlyGlnGlnMSTGlnGlyGlnGlnWETCinGlriVa^^ 

AluI BstNI
I I 
, 2623 ACGCACTTACCTAGAGCTTGCAACATCAGGCAAGtTAGCATIfGCCCCTTCCAGAAGACCATGCCTGGG 2691 
•rhrHiBLeuProArgAlaCysAsnlleArgGlnValSerlleCyoProPheGlnlyBthrlifflTProGly 
2639. 268? 

MspI

HpaII XhoI
HaeIII TaqI

ApaI HinfI AvaI AccI

II I || | 

2692 CCCGGCTTCTACTAGATTCCAAACGAATATCCTCGAGAGTGTGTATACCACGGTGATATCAGTGTGGTT 2760 
FroGlyPheTyr , 

2694 2707 2724 2736 

2692 2725 

2694 2724 

2694 

HincII RsaI
2761 GTTGATGTATGiTAACACTACATAfiTCATGGTGTGTGTTCCATAAATAATGXACTAATGTAATAAGAAC 2829 
2771 2813 

AccI

2830 TACTCCGTAGACGGTAATAAAAGAGAAGTTTTTTTTTTTACTCTTGCTACTTTCCTATAAAGTGATGAT 2898 
2838 
ScaI
RsaI

2899 TMCAACXGATACACCAAAAAGAAAACAATTAATCTATATTCACMTGAAGCAGTACTAGTCTAT-GAA 2967 

'• 2934 . 
2954 

Sau3AI

',2966 CATGTCAGATTTTCTTTTTCTAAATGTCTAAtTAAGCCTTCAAGGCTMST 3 03 6 

3028 

Sau3AI Sau3AI
BamHI HinfI BclI

3037 ATGGSATCCAAOUVAGACTCAAATCTGGTTTTGATCAGATACTTCAAAACTATTTTTGTATTCATTAAA 3105 

3041 3053 3069 

3041 3069 

HinfI

3106 TTATGCAAGTGTTCTTTTATTTGGTGAAGACTCTTTAGAAGCAAAGAACGACAAGCAGTAATAAAAAM 3174 

3135 

3175 ACAAAGTTCAGTTTTAAGATTTGrrATTGACTTATTGTCATTTGAAAAATATAGTATGATATTAATATA 3243 



3244 GTTTTAIXTATATAATGCTTGTCTATTCAAGATTTGAGAACAT'TAATATGATACTGTCCACATATCCAA 331 2 



NdeI

3313 TATACTAAGTTTC^TTTCTGTTCAAACATATGATAAGATGGTCAAATGATTATGAGTTTTGTTATTTAC 3381 

3341 

TaqI Sau3AI
AluI RsaI

3382 CTGAAGAAAAGATAAGTGAGCTTCGAGTITCTGAAGGGTACGTGATCtTCATTTCMGGCTAAAAGCGA 3450 

3402 3421 
3405. 3425 

34 51 ATATGACATCACCTAGAGAAAGCCGATAATAGTAAACTCTGTTCTTGGTTTTTGGTTTAATCAAACCGA 3519 
MspI

MspI DdeI HpaII
HpaII AluI NdeI HinfI

III I I I 

3320 ACCGGTAGCTGAGTGTCAAGTCAGCAAACATCGCAAACCATATGTCAATTGGTTAGATTCpCGGTTTAA 3588 . 

3522 3528 3560 3575 

3522 3529 3581 

3/561 

MspI

HpaII

35B9 GTTGTAAACCGGTATTTCATTTGGTGAAAACCCTAGAAGCCAGCCANCCTTTTTAAiCTAATTTTTGCA 3657 

3598 
3598 

Hinfl 
Hindi 
Ddel BBtHI 

i Hi 

3658 AACGAGAAGTCACCACACCTCTCCACTAAMCCCTGAACCTTACTGAGAGAAGCAGAGNCANNAAAGAA 3726 

3702 3718 
3715 
3714 

3727 CAAATAAAACCCGAAGATGAGACCACCACGTGCGGCGGGACGTTCAGGGGAO^GAGGAAGASAATGR 3795 



AvaII

AluI AvaII

II I 
3796 CGGCGG5HNTTTGGTGGCGGCGGCGGACGTTTTGGTGGCGGCGGTGGACGTTTTGGTGGCGGCGGTGGA 3864 

3804 3863 
3801 

EcoRV AvaII DdeI

I I I 

3865 CCTTTGGTGGTGGATATCGTGACGAAGGACCTCCCAGTGAAGTCATTGGTTCGTTTACTCTTTTCTTAG 3933 

3880 3892 3930 
T4£ I* . M Hindlll

, ?»« Alul Ddei 

1 • II I 

3934 TCGAATCTTATTCTTGCTCTGCTCGTTGTTTTACCGATAAAGCTTAAGACTTTATTGATAAAGTTCTCA .4002 

,3| 37 3976 4000 

3935 3974 

HinfI DdeI

i I II 

<CC3 KTTTGAATGTGAATGAACTGTTTCCTGCTTATTAGTGTTCCTI^rSTTTTfiASTTGAATCACTGTCT'rA 4071 

<°04 4023 4039 4069 

HinfI
I 

4072 GCACTTTTGTTAGATTCATCTTTGTGTTTAAGTTAAAAGGTAGAAACTTCGTGACTTGTCTCCGTTATG 4140 
4085 

HincII

I 

4141 ACAAGGTTAACT TTGTTGGTTATAACAGAAGTTGCGACCTTTCTCCATGCTTGTGAGGGtGATGCIGTG 4209 
4145 

AvaII AluI DdeI Sau3AI
III | 
4210 GACCAAGCTCTCTCAGGCGAAGATCCCTTACTTCAATGCCCCAATCTACTTGGAAAACAAGACACAGAT 4278 

4210 4217 4222 4231 

TaqI
SalI
PstI

HindIII HincII
Sau3AI AluI AccI EcoRI

4279 TGGGAMGTTGATGAGATCCAAGCTTGGGCTGCAGGTCGACGAATTC 4325 

.1 4294 4302 4316 4321 

4300 4314 
4313 
4315 
4316 
EA9-14 Linear LENGTH - 416



ScrFI 

MaeXII 
HpaXX 

CmuII Mnll Bpall Nsp 

M I I I 



7 21 59 71 

7 

9 

7 

Jwatxx 

Mmel , jjjbdi 

Mbol iB&siZt 
Dpnl GdiXX • 

B»ral Bell Maalll . Cf'rl DpftI 

I III MM 

72 GGA6AATGCACMCTCATCTTGATCACGGG6TAXCTSCGGTTGGATAGGGCCGATCTAAAAACGGATTAAA If: 
82 93 102 120 126 

95 120 
93 • 122 

93 124 

120 

Xholi NlalV 
Seal Niaiv Nlaili 

Mbol rial Mlalll 

Rsal Dpnl tcoRI Avail MhlX' Mbol 

BlnX BatnHX Bin! MnlX Atuz MaaXI Dpnl Bin! tco 

II i I II I Ml I I Ml I I 

143 GTACTGGATCCTCMGAATTCATGGGGACCAAAATGGGGAGAACStGGATXCATGAGGATCAAAAAAGATA 213 
144 149 137 163 169 186 \ ' 202 208 213 

145 131 159 169 190 200 

149 167 198 

14b 131 167 
149 170 

Hlaxxx Rial 
BamX 

214 TCAAGCCTAAACACGGACAAtGTGGTCTTGCCATGAATGCTTCG^ 2S 1 -, 

255 

249 259 

HindXII N«p {7524) I 

BpaXX Aluz BInil NlaXII 

III I I 

285 AATATCCGGTTAAGCTTTAGAAJUiA^TGTGTGtGTTGGTrATAAtTTAAGACTGlHStTGCAtSTAATTTGT 3S> 
291 299 335 346 

297 348 
SfaNI
I 

336 GAMTGGTAAGTTTATGTGATG<^AAAGATTT6ATMMMfiMiL£JWQG3QGC}GGGQG<3 416 
365 



Partial sequence of [text illegible in source] to identify a
genomic clone containing a promoter. The polyadenylation
signal AATAAA is underlined, as are the polyA tails.
The stop codon of the presumed open reading frame is underlined.
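
The two features named in this caption, the AATAAA polyadenylation signal and the stop codon that closes the presumed open reading frame, can be located with a short Python sketch. It assumes the reading frame is known in advance. The function names `polya_signals` and `first_stop` and the test sequence are hypothetical and are not taken from the EA9 sequence; the three stop codons are the standard ones.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def polya_signals(seq, signal="AATAAA"):
    """Return 1-based start positions of every occurrence of the signal."""
    seq = seq.upper()
    n = len(signal)
    return [i + 1 for i in range(len(seq) - n + 1) if seq[i:i + n] == signal]


def first_stop(seq, frame_start=1):
    """Return the 1-based position of the first in-frame stop codon, or None.

    frame_start is the 1-based position of the first base of the frame.
    """
    seq = seq.upper()
    for i in range(frame_start - 1, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 1
    return None


# Hypothetical cDNA 3' end: a short frame, a stop codon, a polyadenylation
# signal, and a polyA tail.
cdna = "ATGGCTTAAGGAATAAACTTAAAAAAA"
print(first_stop(cdna))     # stop codon position in frame 1
print(polya_signals(cdna))  # AATAAA position(s)
```

The stop codon search walks the sequence in steps of three from `frame_start`, so a TAA, TAG or TGA that falls out of frame is ignored.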